Non-independence in statistical tests for discrete cross-species data.

Authors

  • A Grafen
  • M Ridley
Abstract

The paper describes three previously undetected effects, due to biases and non-independence, that can arise in statistical tests for associations between character states in cross-species data. One effect, which we call the family problem, is general to all known methods. In phylogenetic data, the ancestral character state from which changes occur, or below which variation is found, is likely to be the same for many regions of the tree. The family problem interacts with two kinds of non-independence that arise from the methods existing tests use to reconstruct character states. The kind of non-independence differs according to whether a method reconstructs joint or single character states. Methods that work with joint character states, like Ridley's (1983), suffer from the problem that, under parsimony, a character state cannot change to itself. Methods that work with single character states suffer from a different problem: within a locally variable region of the tree, null data are more likely to show two single changes, one in each character on separate branches, than one double change in both characters on the same branch. Associations opposite to the locally ancestral state are therefore likely to be found in more than 50% of the variable regions. In real data sets, the family problem acts to spotlight the other kinds of bias: if the family problem is large, the bias in tests due to the way they reconstruct characters will be large, whereas if it is small, the local biases tend to cancel and disappear in the aggregate.
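The argument about single character states can be illustrated with a toy calculation. This is only a sketch under simplifying assumptions that are not taken from the paper: a locally variable region is treated as `n_branches` equally likely branches, each of the two characters changes exactly once, and the two changes are placed independently. The function name `placement_probabilities` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def placement_probabilities(n_branches):
    """Return (P(double change), P(two separate single changes)).

    Toy null model (an assumption, not the paper's procedure): each of two
    characters changes exactly once, and each change lands on one of
    n_branches branches independently and with equal probability.
    """
    if n_branches < 1:
        raise ValueError("n_branches must be at least 1")
    # Every (branch of change in character 1, branch of change in character 2)
    # pair is equally likely under the toy model.
    placements = list(product(range(n_branches), repeat=2))
    same = sum(1 for a, b in placements if a == b)
    total = len(placements)
    return Fraction(same, total), Fraction(total - same, total)


if __name__ == "__main__":
    for n in (2, 3, 4):
        double, separate = placement_probabilities(n)
        print(f"branches={n}: double change={double}, separate changes={separate}")
```

Under these assumptions the two outcomes are equally likely with two branches, and separate single changes become the more likely outcome as soon as the region has three or more branches. That agrees in direction with the abstract's claim, although the paper's own result depends on details of character reconstruction that this sketch does not model.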

Similar resources

Statistical Tests for Discrete Cross-species Data

Four methods have been proposed that can be used to test for associations between the states of discrete characters in cross-species data and that do not suffer from non-independence due to overcounting of data points. The tests are those of Ridley (1983), Burt (1989), Grafen (1989), and a new test called the ICDE test. The aim of the paper is to measure the Type I error rates for these methods...

Modeling Nonnegative Data with Clumping at Zero: A Survey

Applications in which data take nonnegative values but have a substantial proportion of values at zero occur in many disciplines. The modeling of such “clumped-at-zero” or “zero-inflated” data is challenging. We survey models that have been proposed. We consider cases in which the response for the non-zero observations is continuous and in which it is discrete. For the continuous and then the d...

Speeding up the execution of a large number of statistical tests of independence

A massive amount of conditional independence tests on data must be performed in the problem of learning the structure of probabilistic graphical models when using the independence-based approach. An intermediate step in the computation of independence tests is the construction of contingency tables from the data. In this work we present an intelligent cache of contingency tables that allows the...

Non-causality in bivariate binary time series

In this paper we develop a dynamic discrete-time bivariate probit model, in which the conditions for Granger non-causality can be represented and tested. The conditions for simultaneous independence are also worked out. The model is extended in order to allow for covariates, representing individual as well as time heterogeneity. The proposed model can be estimated by Maximum Likelihood. Granger...

Avoiding non-independence in fMRI data analysis: Leave one subject out

Concerns regarding certain fMRI data analysis practices have recently evoked lively debate. The principal concern regards the issue of non-independence, in which an initial statistical test is followed by further non-independent statistical tests. In this report, we propose a simple, practical solution to reduce bias in secondary tests due to non-independence using a leave-one-subject-out (LOSO...

Journal:
  • Journal of Theoretical Biology

Volume 188, Issue 4

Publication date: 1997